# Low Memory Consumption
## Apriel-Nemotron-15b-Thinker GGUF
- License: MIT
- Author: Mungert
- Downloads: 1,097 · Likes: 1
- Tags: Large Language Model, Transformers

Apriel-Nemotron-15b-Thinker is a capable reasoning model that performs strongly among models of its scale. Its efficient memory usage and solid reasoning ability make it suitable for a range of enterprise and academic scenarios.
## Optical Flow MEMFOF Tartan T TSKH
- License: BSD-3-Clause
- Author: egorchistov
- Downloads: 201 · Likes: 2
- Tags: Video Processing, PyTorch, English

MEMFOF is a memory-efficient optical flow estimation method designed for full-HD video, combining high accuracy with low memory usage.
## FLUX.1-dev ControlNet Union Pro 2.0 FP8
- License: Other
- Author: ABDALLALSWAITI
- Downloads: 2,023 · Likes: 15
- Tags: Image Generation, English

This is the FP8-quantized version of the Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0 model, converted from the original BFloat16 weights using PyTorch's native FP8 support to improve inference performance.
## DeepSeek-R1-Distill-Qwen-1.5B
- License: MIT
- Author: litert-community
- Downloads: 138 · Likes: 4
- Tags: Large Language Model

Multiple variants of DeepSeek-R1-Distill-Qwen-1.5B adapted for the LiteRT framework and the MediaPipe LLM Inference API, deployable on Android.
## Llama 3.2 3B Instruct Unsloth Bnb 4bit
- Author: unsloth
- Downloads: 240.35k · Likes: 9
- Tags: Large Language Model, Transformers, English

An efficient large language model based on Meta's Llama-3.2-3B-Instruct, optimized with Unsloth's dynamic 4-bit quantization.
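Several of the models in this collection rely on 4-bit weight quantization to cut memory use. The general idea can be sketched with block-wise absmax quantization: each block of weights is scaled by its largest absolute value, then rounded to a signed 4-bit integer. This is a conceptual illustration only, not Unsloth's or bitsandbytes' actual implementation; the block size and `[-7, 7]` code range are assumptions for the sketch.

```python
# Conceptual sketch of block-wise absmax 4-bit quantization, the general
# idea behind 4-bit weight formats. NOT the actual Unsloth/bitsandbytes
# algorithm -- a plain-Python illustration only.

def quantize_4bit(weights, block_size=4):
    """Map floats to signed 4-bit codes in [-7, 7], one absmax scale per block."""
    quantized, scales = [], []
    for i in range(0, len(weights), block_size):
        block = weights[i:i + block_size]
        scale = max(abs(w) for w in block) or 1.0  # avoid divide-by-zero
        scales.append(scale)
        quantized.append([round(w / scale * 7) for w in block])
    return quantized, scales

def dequantize_4bit(quantized, scales):
    """Recover approximate floats from 4-bit codes and per-block scales."""
    out = []
    for block, scale in zip(quantized, scales):
        out.extend(q / 7 * scale for q in block)
    return out

if __name__ == "__main__":
    w = [0.12, -0.5, 0.33, 0.01, 1.5, -0.7, 0.0, 0.25]
    q, s = quantize_4bit(w)
    w_hat = dequantize_4bit(q, s)
    print("max abs error:", max(abs(a - b) for a, b in zip(w, w_hat)))
```

Storing 4-bit codes plus one scale per block is what shrinks a 16-bit model by roughly 4x; the rounding error is bounded by half a quantization step per weight.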
## Universal NER UniNER 7B All Bnb 4bit Smashed
- Author: PrunaAI
- Downloads: 22 · Likes: 1
- Tags: Large Language Model, Transformers

PrunaAI's compressed version of the UniNER-7B-all model. Quantization significantly reduces memory usage and energy consumption while preserving strong named-entity-recognition performance.